Using Discourse Information for Paraphrase Extraction
Abstract
Previous work on paraphrase extraction using parallel or comparable corpora has generally not considered the documents’ discourse structure as a useful information source. We propose a novel method for collecting paraphrases relying on the sequential event order in the discourse, using multiple sequence alignment with a semantic similarity measure. We show that adding discourse information boosts the performance of sentence-level paraphrase acquisition, which consequently gives a tremendous advantage for extracting phrase-level paraphrase fragments from matched sentences. Our system beats an informed baseline by a margin of 50%.
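As a rough illustration of aligning sentences by their order in the discourse, the sketch below aligns two short event descriptions with a pairwise Needleman-Wunsch alignment. It is not the authors' system: the abstract describes multiple sequence alignment with a semantic similarity measure, whereas this sketch aligns only two sequences and uses word-overlap (Jaccard) similarity as a stand-in. The names `jaccard` and `align`, the parameters `gap` and `shift`, and the example sentences are all assumptions made for illustration.

```python
from typing import Callable, List, Optional, Tuple


def jaccard(a: str, b: str) -> float:
    """Stand-in similarity (an assumption): word-overlap Jaccard score in [0, 1]."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa and not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


def align(
    seq_a: List[str],
    seq_b: List[str],
    sim: Callable[[str, str], float] = jaccard,
    gap: float = -0.1,    # hypothetical cost of leaving a sentence unmatched
    shift: float = 0.3,   # hypothetical offset: pairs with sim < shift score negative
) -> List[Tuple[Optional[int], Optional[int]]]:
    """Globally align two sentence sequences while preserving their order.

    Returns (index_in_a, index_in_b) pairs; None marks an unmatched sentence.
    """
    n, m = len(seq_a), len(seq_b)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    back = [[""] * (m + 1) for _ in range(n + 1)]  # backpointers for the traceback
    for i in range(1, n + 1):
        score[i][0], back[i][0] = score[i - 1][0] + gap, "up"
    for j in range(1, m + 1):
        score[0][j], back[0][j] = score[0][j - 1] + gap, "left"
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            options = [
                (score[i - 1][j - 1] + sim(seq_a[i - 1], seq_b[j - 1]) - shift, "diag"),
                (score[i - 1][j] + gap, "up"),
                (score[i][j - 1] + gap, "left"),
            ]
            # On ties, max() keeps the first option, so a match is preferred.
            score[i][j], back[i][j] = max(options, key=lambda o: o[0])

    # Follow the backpointers from the bottom-right corner to the origin.
    pairs: List[Tuple[Optional[int], Optional[int]]] = []
    i, j = n, m
    while i > 0 or j > 0:
        move = back[i][j]
        if move == "diag":
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif move == "up":
            pairs.append((i - 1, None))
            i -= 1
        else:
            pairs.append((None, j - 1))
            j -= 1
    pairs.reverse()
    return pairs


# Two made-up descriptions of the same sequence of events.
script_a = ["fill the kettle with water", "switch the kettle on", "pour water into the cup"]
script_b = ["put water in the kettle", "get a cup", "turn the kettle on",
            "pour the water into the cup"]
for ia, ib in align(script_a, script_b):
    left = script_a[ia] if ia is not None else "-"
    right = script_b[ib] if ib is not None else "-"
    print(f"{left!r:35} <-> {right!r}")
```

Sentence pairs that end up in the same alignment column are candidate sentence-level paraphrases. Because the alignment preserves order, a sentence can only be matched to a sentence at a compatible position in the other sequence, which is one simple way to use sequential event order as a constraint.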
Similar resources
Squibs: On Paraphrase and Coreference
Paraphrase extraction and coreference resolution have applications in Question Answering, Information Extraction, Machine Translation, and so forth. Paraphrase pairs might be coreferential, and coreference relations are sometimes paraphrases. The two overlap considerably (Hirst 1981), but their definitions make them significantly different in essence: Paraphrasing concerns meaning, whereas core...
On Paraphrase and Coreference
Paraphrase extraction and coreference resolution have applications in Question Answering, Information Extraction, Machine Translation, and so forth. Paraphrase pairs might be coreferential, and coreference relations are sometimes paraphrases. The two overlap considerably (Hirst 1981), but their definitions make them significantly different in essence: Paraphrasing concerns meaning, whereas coref...
CUSAT_NLP@DPIL-FIRE2016: Malayalam Paraphrase Detection
This paper describes an approach for paraphrase detection in Malayalam sentences developed as part of the FIRE 2016 Shared Task on Paraphrase Detection in Indian Languages. The task of paraphrase detection is finding a sentence with the same meaning as another sentence, expressed using the same or different words. This detection is done by a semantic approach which is language dependent. Individual words...
On-Demand Information Extraction
At present, adapting an Information Extraction system to new topics is an expensive and slow process, requiring some knowledge engineering for each new topic. We propose a new paradigm of Information Extraction which operates 'on demand' in response to a user's query. On-demand Information Extraction (ODIE) aims to completely eliminate the customization effort. Given a user’s query, the system ...
Using Multiple Metrics in Automatically Building Turkish Paraphrase Corpus
Paraphrasing is expressing similar meanings with different words in a different order. In this sense it is viewed as translation within the same language. It is an important issue in natural language processing for automatic machine translation, question answering, text summarization and language generation. Studies in paraphrasing can be classified as paraphrase extraction, paraphrase generation, pa...